Integration of context-dependent durational knowledge into HMM-based speech recognition
Abstract
This paper presents research on integrating context-dependent durational knowledge into HMM-based speech recognition. The first part of the paper obtains relations between the parameters of the context-free HMMs and their durational behaviour, in preparation for the context-dependent durational modelling presented in the second part. Duration integration is realised via rescoring in the post-processing step of our N-best monophone recogniser. We use the multi-speaker TIMIT database for our analyses.

2. DPDF OF STANDARD HMM

The single-state durational pdf (dpdf), which is geometric, is less important than the dpdf of the whole HMM, because in practice it is the latter that models a phonetic segment. In this section we first derive the closed-form whole-model dpdf for a general left-to-right HMM. Left-to-right is by far the most common type of transition topology used for speech recognition. It may include any number of skip transitions and parallel paths, but no feedback loops that contain more than one state. We then analyse the general properties of the dpdf, with the help of some examples of useful topologies.
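The whole-model dpdf can also be evaluated numerically, which may help when checking a closed-form expression. The sketch below is not the derivation from the paper: it simply propagates state occupancy frame by frame and records the probability of leaving the model after each frame. The function name `whole_model_dpdf` and the parameters `entry`, `trans` and `exit_probs` are hypothetical names chosen for this illustration, and the transition values are made up.

```python
from math import comb


def whole_model_dpdf(entry, trans, exit_probs, max_dur):
    """Return [P(d) for d = 1..max_dur] for an HMM over its emitting states.

    All names here are illustrative assumptions, not taken from the paper.
    entry[i]      : probability of entering the model in state i
    trans[i][j]   : probability of moving from state i to state j
                    (i == j is the self-loop)
    exit_probs[i] : probability of leaving the model from state i
    Row i of trans plus exit_probs[i] is expected to sum to 1.
    """
    n = len(entry)
    # alpha[i] = P(model occupies state i at the current frame)
    alpha = list(entry)
    dpdf = []
    for _ in range(max_dur):
        # Probability that the current frame is the last one in the model.
        dpdf.append(sum(alpha[i] * exit_probs[i] for i in range(n)))
        # Advance the occupancy by one frame.
        alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) for j in range(n)]
    return dpdf


# Single state with self-loop probability a: the dpdf is geometric.
a = 0.8
single = whole_model_dpdf([1.0], [[a]], [1.0 - a], 50)

# Three-state linear left-to-right model, no skips, equal self-loops b.
b = 0.6
chain = whole_model_dpdf(
    [1.0, 0.0, 0.0],
    [[b, 1.0 - b, 0.0],
     [0.0, b, 1.0 - b],
     [0.0, 0.0, b]],
    [0.0, 0.0, 1.0 - b],
    50,
)


def chain_closed_form(d, n_states, self_loop):
    """Negative-binomial dpdf of a skip-free chain with equal self-loops."""
    if d < n_states:
        return 0.0
    return (comb(d - 1, n_states - 1)
            * self_loop ** (d - n_states)
            * (1.0 - self_loop) ** n_states)
```

For the single state the result follows P(d) = a^(d-1)(1-a). For the three-state chain the minimum duration is three frames and the values agree with the negative-binomial form in `chain_closed_form`, so the whole-model dpdf is no longer monotonically decreasing. Skip transitions or parallel paths can be tried by filling in further entries above the diagonal of `trans`.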
Related resources
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighbor phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
Suprasegmental duration modelling with elastic constraints in automatic speech recognition
In this paper a method of integrating a model of suprasegmental duration with a HMM-based recogniser at the post-processing level is presented. The N-Best utterance output is rescored using a suitable linear combination of acoustic log-likelihood (provided by a set of tied-state triphone HMMs) and duration log-likelihood (provided by a set of durational models). The durational model used in the...
Better HMM-Based Articulatory Feature Extraction with Context-Dependent Model
The majority of speech recognition systems today use Hidden Markov Models (HMMs) as acoustic models, since they can powerfully train and map a speech utterance into a sequence of units. Such systems perform even better if the units are context-dependent. Analogously, when HMM techniques are applied to the problem of articulatory feature extraction, context-dependent articulato...
Modelling care of articulation with HMMs is dangerous
Changes in care of articulation (COA) affect both the spectral and durational characteristics of speech. This can have severe repercussions on both the success of speech recognition, and the quality of speech synthesis. Although auto-segmentation has proven useful for measuring the durational effects of COA, an automatic spectral measurement has proven more problematic [5]. In this paper, we wi...
Off-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model
In order to facilitate the entry of data into the computer and its digitalization, automatic recognition of printed texts and manuscripts is a considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...
Publication date: 1996